An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-mail Messages

Abstract

The growing problem of unsolicited bulk e-mail, also known as "spam", has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in "encrypted" form, contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures, investigating at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the Naive Bayesian filter is compared, in terms of performance, to a filter that uses keyword patterns, and which is part of a widely used e-mail reader.
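
As a rough illustration of the approach the abstract describes, the sketch below trains a small Naive Bayesian filter on tokenized messages and applies a cost-sensitive decision threshold. It is not the authors' implementation. The function names, the binary word-presence attributes, the add-one smoothing, and the rule "flag as spam when P(spam | message) > cost_ratio / (1 + cost_ratio)" are all assumptions made for this example; the abstract specifies none of them.

```python
import math
from collections import Counter


def train(messages):
    """Count, per class, how many messages contain each word.

    `messages` is a list of (tokens, is_spam) pairs (hypothetical format).
    """
    doc_counts = {True: 0, False: 0}
    word_docs = {True: Counter(), False: Counter()}
    vocab = set()
    for tokens, is_spam in messages:
        present = set(tokens)
        doc_counts[is_spam] += 1
        word_docs[is_spam].update(present)
        vocab |= present
    return {"doc_counts": doc_counts, "word_docs": word_docs, "vocab": vocab}


def spam_probability(model, tokens):
    """Return P(spam | message) under the naive independence assumption."""
    present = set(tokens)
    total = sum(model["doc_counts"].values())
    log_score = {}
    for label in (True, False):
        n = model["doc_counts"][label]
        # Class prior with add-one smoothing (an assumption of this sketch).
        score = math.log((n + 1) / (total + 2))
        for word in model["vocab"]:
            # Smoothed probability that a message of this class contains the word.
            p = (model["word_docs"][label][word] + 1) / (n + 2)
            score += math.log(p if word in present else 1.0 - p)
        log_score[label] = score
    # Normalize in log space to avoid underflow.
    top = max(log_score.values())
    spam = math.exp(log_score[True] - top)
    ham = math.exp(log_score[False] - top)
    return spam / (spam + ham)


def is_spam(model, tokens, cost_ratio=1.0):
    """Flag a message as spam only if P(spam) exceeds a cost-based threshold.

    `cost_ratio` expresses how many times worse it is to block a legitimate
    message than to let a spam message through (assumed parameterization).
    """
    threshold = cost_ratio / (1.0 + cost_ratio)
    return spam_probability(model, tokens) > threshold
```

Raising `cost_ratio` makes the filter more conservative: a message must look far more spam-like before it is blocked, which reflects the idea that misclassifying a legitimate message is usually costlier than missing a spam message.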